Closed Sequential Pattern Mining in High Dimensional Sequences

Authors

  • Meng Han
  • Zhihai Wang
  • Jidong Yuan
Abstract

High-dimensional sequences, such as biological sequences, are characterized by a small number of transactions and a large number of items in each transaction. Mining sequential patterns in such sequences needs to consider different forms of patterns, such as contiguous patterns and local patterns that appear more than once in a particular sequence, among others. Mining closed patterns can yield not only a more compact yet complete result set but also better efficiency. In this paper, a novel algorithm based on BIDE (BI-Directional Extension) and multi-support is presented specifically for high-dimensional sequences. It mainly mines three types of closed sequential patterns: sequential patterns, local sequential patterns and total sequential patterns. Thorough experiments on biological sequences demonstrate that the proposed algorithm can reduce memory consumption and generate more compact patterns.
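To make the notion of a closed pattern concrete, here is a minimal sketch under the standard definition: a frequent pattern is closed if no proper super-sequence has the same support. This is a brute-force illustration on a toy database, not the BIDE-based algorithm of the paper, and it does not model the paper's local or total patterns. The function names and the three example sequences are assumptions made for illustration only.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Enumerate every subsequence and keep the frequent ones (toy scale only)."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    result = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result


def closed_patterns(frequent):
    """Drop patterns that have a longer super-sequence with equal support."""
    closed = {}
    for pattern, count in frequent.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_count == count
            and is_subsequence(pattern, other)
            for other, other_count in frequent.items()
        )
        if not absorbed:
            closed[pattern] = count
    return closed


# Hypothetical toy database of three short sequences.
database = ["ACGT", "ACT", "AGT"]
frequent = frequent_patterns(database, min_support=2)
closed = closed_patterns(frequent)

print(len(frequent))           # 11 frequent patterns
print(sorted(closed.items()))  # 3 closed patterns
```

On this toy input the 11 frequent patterns collapse to 3 closed ones (`AT` with support 3, `ACT` and `AGT` with support 2), and the support of every dropped pattern can still be recovered from a closed super-sequence. That is the sense in which the closed set is both more compact and complete.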

Similar resources

Mining closed and multi-supports-based sequential pattern in high-dimensional dataset

Previous mining algorithms for high-dimensional datasets, such as biological datasets, produce very large pattern sets that include small and discontinuous sequential patterns. These patterns carry no useful information. Mining sequential patterns in such sequences needs to consider different forms of patterns, such as contiguous patterns, local patterns which appear...

A Framework for Mining Closed Sequential Patterns

Sequential pattern mining algorithms developed so far provide better performance for short sequences but are inefficient at mining long sequences, since long sequences generate a large number of frequent subsequences. To efficiently mine long sequences, closed sequential pattern mining algorithms have been developed. These algorithms mine closed sequential patterns which don’t have any super se...

Sequential Pattern Mining by Pattern-Growth: Principles and Extensions

Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP...

Extracting Feature Sequences in Software Vulnerabilities Based on Closed Sequential Pattern Mining

Feature Extraction is significant for determining security vulnerabilities in software. Mining closed sequential patterns provides complete and condensed information for non-redundant frequent sequences generation. In this paper, we discuss the feature interaction problem and propose an efficient algorithm to extract features in vulnerability sequences. Each closed sequential pattern represents...

Distributed Sequential Pattern Mining: A Survey and Future Scope

Distributed sequential pattern mining is a data mining method for discovering sequential patterns from large sequential databases in a distributed environment. It is used in many applications, including web mining, customer shopping records, biomedical analysis, scientific research, etc. Much research has been done on sequential pattern mining in various distributed environments like Grid, Had...

Journal title:
  • JSW

Volume 8  Issue 

Pages  -

Publication date 2013